methratio file in SQLShare:
https://sqlshare.escience.washington.edu/sqlshare#s=query/sr320%40washington.edu/BiGill_methratio_v9_A.txt



---- 

Having a look at the raw output:



Based on methratio: when methratio is NA, the counts are always

C     CT     revG     revGA
0       #     0     #


CI_lower and CI_upper also NA



---

7.1 methratio.py

Python script to extract methylation ratios from BSMAP mapping results. Requires Python 2.X.

For human genome, methratio.py needs ~26GB memory.  

For systems with limited memory, the user can set the -c/--chr option to process specified chromosomes only, and combine results for all chromosomes afterwards.
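
The "combine afterwards" step could be sketched roughly as below. This is not part of BSMAP; it is a minimal Python 3 sketch that assumes each per-chromosome output starts with one header line (i.e. methratio.py was run without -n). The function name combine_methratio and the three-column rows in the demo are made up for illustration.

```python
import io

def combine_methratio(inputs, output):
    """Concatenate per-chromosome outputs, keeping only the first header line.

    Assumption: every input begins with exactly one header line.
    """
    header_written = False
    for handle in inputs:
        header = handle.readline()
        if not header_written:
            output.write(header)
            header_written = True
        for line in handle:
            output.write(line)

# In-memory demonstration with made-up rows (real files have 12 columns).
chr1 = io.StringIO("chr\tpos\tratio\nchr1\t10\t0.500\n")
chr2 = io.StringIO("chr\tpos\tratio\nchr2\t20\t0.250\n")
combined = io.StringIO()
combine_methratio([chr1, chr2], combined)
result = combined.getvalue()
```

With real files, the same function works on handles returned by open().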

Usage: python methratio.py [options] BSMAP_MAPPING_FILES

BSMAP_MAPPING_FILES could be one or more output files from BSMAP.

The format will be determined by the filename suffix. 

(SAM format for *.sam and *.bam, BSP format for other filenames.)
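
The suffix rule above amounts to something like the following. This is a sketch of the stated rule only, not methratio.py's actual code; guess_format is a hypothetical name.

```python
def guess_format(filename):
    """Return the input format implied by the filename suffix."""
    if filename.endswith((".sam", ".bam")):
        return "SAM"
    # Every other filename is treated as BSP.
    return "BSP"

formats = [guess_format(name) for name in
           ("bsmap_sample1.bsp", "bsmap_sample2.sam", "bsmap_sample3.bam")]
```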

Options:

  -h, --help            show this help message and exit

  -o FILE, --out=FILE   output file name. (required)

  -d FILE, --ref=FILE   reference genome fasta file. (required)

  -c CHR, --chr=CHR     process only specified chromosomes. [default: all]
                        example: --chr=chr1,chr2 (this uses ~4.5GB compared with ~26GB for the whole genome)

  -s PATH, --sam-path=PATH
                        path to samtools. [default: none]

  -u, --unique          process only unique mappings/pairs.

  -p, --pair            process only properly paired mappings.

  -z, --zero-meth       report loci with zero methylation ratios.

  -q, --quiet           don't print progress on stderr.

  -r, --remove-duplicate
                        remove duplicated mappings to reduce PCR bias.
                        (This option should not be used on RRBS data. For WGBS, it is sometimes
                        hard to tell whether duplicates are caused by PCR, due to high sequencing depth.)

  -t N, --trim-fillin=N
                        trim N fill-in nucleotides in DNA fragment end-repairing. [default: 2]
                        (This option is only for paired-end mapping. For RRBS, N can be determined by the
                        distance between cutting sites on forward and reverse strands. For WGBS, N is usually between 0 and 3.)

  -g, --combine-CpG     combine CpG methylation ratio from both strands. [default: False]

  -m FOLD, --min-depth=FOLD
                        report loci with sequencing depth>=FOLD. [default: 1]

  -n, --no-header       don't print a header line

  -i CT_SNP, --ct-snp=CT_SNP
                        how to handle CT SNP ("no-action", "correct", "skip"),
                        default: "correct".
                        "correct":      correct the methylation ratio according to the C/T SNP information
                                        estimated by the G/A counts on reverse strand, see the output format below for details.
                        "skip":         do not report loci with C/T SNP detected (i.e. detected A on reverse strand)
                        "no-action":    do not consider C/T SNP.

Output format: tab delimited txt file with the following columns:

    1) chromosome

    2) coordinate (1-based)

    3) strand

    4) sequence context (2nt upstream to 2nt downstream in Watson strand direction)

    5) methylation ratio, calculated as #C_counts / #eff_CT_counts

    6) number of effective total C+T counts on this locus (#eff_CT_counts) 

            CT_SNP="no-action", #eff_CT_counts = #CT_counts

            CT_SNP="correct", #eff_CT_counts = #CT_counts * (#rev_G_counts / #rev_GA_counts)

    7) number of total C counts on this locus (#C_counts)

    8) number of total C+T counts on this locus (#CT_counts)

    9) number of total G counts on this locus of reverse strand (#rev_G_counts)

    10) number of total G+A counts on this locus of reverse strand (#rev_GA_counts)

    11) lower bound of 95% confidence interval of methylation ratio, calculated by Wilson score interval for binomial proportion.

    12) upper bound of 95% confidence interval of methylation ratio, calculated by Wilson score interval for binomial proportion.
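
To see how columns 5, 6, 11 and 12 relate to the raw counts, here is a minimal Python 3 sketch of the formulas listed above. It is not methratio.py's code. Assumptions: the standard Wilson score formula with z = 1.96, #eff_CT_counts used as the sample size, no correction applied when #rev_GA_counts is 0, and NA (None here) returned when #eff_CT_counts is 0. The function name methylation_ratio is hypothetical.

```python
import math

def methylation_ratio(c, ct, rev_g, rev_ga, ct_snp="correct", z=1.96):
    """Return (ratio, ci_lower, ci_upper), or (None, None, None) if undefined."""
    eff_ct = float(ct)
    if ct_snp == "correct" and rev_ga > 0:
        # eff_CT = CT * (rev_G / rev_GA), as in column 6 above.
        eff_ct = ct * (rev_g / rev_ga)
    # Assumption: with rev_ga == 0 there is no evidence, so no correction.
    if eff_ct == 0:
        return (None, None, None)  # 0 / 0: ratio and interval are undefined
    p = c / eff_ct
    # Standard Wilson score interval for a binomial proportion.
    denom = 1 + z * z / eff_ct
    center = (p + z * z / (2 * eff_ct)) / denom
    spread = p * (1 - p) / eff_ct + z * z / (4 * eff_ct * eff_ct)
    half = z * math.sqrt(max(spread, 0.0)) / denom
    return (p, center - half, center + half)

# The pattern seen in the raw output: C = 0 and revG = 0 with nonzero CT, revGA.
na_case = methylation_ratio(0, 5, 0, 3)
# A made-up locus with no SNP signal on the reverse strand.
ok_case = methylation_ratio(5, 10, 4, 4)
```

Under these assumptions, C = 0 together with revG = 0 makes #eff_CT_counts zero, which would give NA for the ratio and for both confidence bounds.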

Example:

    python methratio.py --chr=chr1,chr2 --ref=hg19.fa --out=methratio.txt rrbsmap_sample*.sam

    python methratio.py -d mm9.fa -o meth.txt -p bsmap_sample1.bsp bsmap_sample2.sam bsmap_sample3.bam 

    python methratio.py -s /home/tools/samtools -t 1 -d arab.fa -o meth.txt bsmap_sample.sam

Note: For overlapping paired hits, nucleotides in the overlapped part should be counted only once instead of twice.

methratio.py can correctly handle such cases for SAM format output, but for BSP format they will still be counted twice, because the BSP format does not contain mapping information of the mate.
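
The difference between counting the overlap once and twice can be illustrated with a small Python 3 sketch. This is only an illustration of the idea, not how methratio.py implements it; covered_positions and the coordinates are made up, and intervals are taken as 1-based inclusive (start, end).

```python
def covered_positions(mate1, mate2):
    """Positions covered by a read pair, each position counted once."""
    positions = set(range(mate1[0], mate1[1] + 1))
    positions.update(range(mate2[0], mate2[1] + 1))
    return positions

# Two 50 nt mates overlapping by 20 nt (made-up coordinates).
pair = ((100, 149), (130, 179))
once = len(covered_positions(*pair))                    # overlap counted once
twice = sum(end - start + 1 for start, end in pair)     # overlap counted twice
```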
